10 research outputs found

    Efficient regularized isotonic regression with application to gene--gene interaction search

    Isotonic regression is a nonparametric approach for fitting monotonic models to data that has been widely studied from both theoretical and practical perspectives. However, this approach encounters computational and statistical overfitting issues in higher dimensions. To address both concerns, we present an algorithm, which we term Isotonic Recursive Partitioning (IRP), for isotonic regression based on recursively partitioning the covariate space through solution of progressively smaller "best cut" subproblems. This creates a regularized sequence of isotonic models of increasing model complexity that converges to the global isotonic regression solution. The models along the sequence are often more accurate than the unregularized isotonic regression model because of the complexity control they offer. We quantify this complexity control through estimation of degrees of freedom along the path. Success of the regularized models in prediction and IRP's favorable computational properties are demonstrated through a series of simulated and real data experiments. We discuss application of IRP to the problem of searching for gene--gene interactions and epistasis, and demonstrate it on data from genome-wide association studies of three common diseases. Comment: Published at http://dx.doi.org/10.1214/11-AOAS504 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)
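For readers unfamiliar with isotonic regression, the following is a minimal sketch of the classical one-dimensional case, solved with the pool-adjacent-violators algorithm. It is not the IRP algorithm described in the abstract, which targets multi-dimensional covariates; the function name `pava` and the example data are illustrative assumptions.

```python
def pava(y, w=None):
    """Least-squares nondecreasing fit to y (1-D isotonic regression).

    Illustrative sketch of pool-adjacent-violators; NOT the IRP
    algorithm from the abstract. `pava` is a hypothetical name.
    """
    n = len(y)
    if w is None:
        w = [1.0] * n
    # Each block holds [weighted mean, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Pool the last two blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    # Expand the blocks back into one fitted value per observation.
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit


print(pava([1, 3, 2, 4, 3, 5]))  # → [1.0, 2.5, 2.5, 3.5, 3.5, 5.0]
```

Each pooled block is a region of constant fitted value, so the number of blocks gives a rough sense of model complexity in this one-dimensional setting.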

    Privacy and Fairness in Recommender Systems via Adversarial Training of User Representations

    Latent factor models for recommender systems represent users and items as low-dimensional vectors. Privacy risks of such systems have previously been studied mostly in the context of recovery of personal information in the form of usage records from the training data. However, the user representations themselves may be used together with external data to recover private user information such as gender and age. In this paper we show that user vectors calculated by a common recommender system can be exploited in this way. We propose the privacy-adversarial framework to eliminate such leakage of private information, and study the trade-off between recommender performance and leakage both theoretically and empirically using a benchmark dataset. An advantage of the proposed method is that it also helps guarantee fairness of results, since all implicit knowledge of a set of attributes is scrubbed from the representations used by the model, and thus cannot enter into the decision making. We discuss further applications of this method towards the generation of deeper and more insightful recommendations. Comment: International Conference on Pattern Recognition and Method

    Low Communication Complexity Protocols, Collision Resistant Hash Functions and Secret Key-Agreement Protocols

    We study communication complexity in computational settings where bad inputs may exist, but they should be hard to find for any computationally bounded adversary. We define a model where there is a source of public randomness but the inputs are chosen by a computationally bounded adversarial participant after seeing the public randomness. We show that breaking the known communication lower bounds of the private coins model in this setting is closely connected to known cryptographic assumptions. We consider the simultaneous messages model and the interactive communication model and show that for any nontrivial predicate (with no redundant rows, such as equality): 1. Breaking the $\Omega(\sqrt{n})$ bound in the simultaneous message case or the $\Omega(\log n)$ bound in the interactive communication case implies the existence of distributional collision-resistant hash functions (dCRH). This is shown using techniques from Babai and Kimmel (CCC '97). Note that with a CRH the lower bounds can be broken. 2. There are no protocols of constant communication in this preset randomness setting (unlike the plain public randomness model). The other model we study is that of a stateful "free talk", where participants can communicate freely before the inputs are chosen and may maintain a state, and the communication complexity is measured only afterwards. We show that efficient protocols for equality in this model imply secret key-agreement protocols in a constructive manner. On the other hand, secret key-agreement protocols imply optimal (in terms of error) protocols for equality.
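As background for the contrast drawn with the plain public randomness model, here is a minimal sketch of the textbook equality test in that model: both parties take inner products over GF(2) of their input with the same shared random rows and compare the resulting short fingerprints. This is not a protocol from the paper; the names `fingerprint`, `equal_with_public_coins`, and `sample_public_rows` are illustrative assumptions, and the error bound holds only when the inputs are fixed before the random rows are drawn.

```python
import secrets


def fingerprint(bits, shared_rows):
    """Inner product over GF(2) of the input with each shared random row."""
    return [sum(a & b for a, b in zip(bits, row)) % 2 for row in shared_rows]


def equal_with_public_coins(x_bits, y_bits, shared_rows):
    """Each party sends only its k-bit fingerprint; a referee compares them.

    Equal inputs always match. Unequal inputs chosen BEFORE the rows are
    drawn match with probability 2**-k. Hypothetical helper, not the
    paper's construction.
    """
    return fingerprint(x_bits, shared_rows) == fingerprint(y_bits, shared_rows)


def sample_public_rows(n, k):
    """Draw k uniformly random n-bit rows to serve as public randomness."""
    return [[secrets.randbits(1) for _ in range(n)] for _ in range(k)]


x = [1, 0, 1, 1, 0, 0, 1, 0]
rows = sample_public_rows(len(x), k=20)
print(equal_with_public_coins(x, list(x), rows))  # → True
```

When there are fewer rows than input bits, anyone who sees the rows can solve a linear system to find two distinct inputs with identical fingerprints, which illustrates why letting the adversary choose inputs after seeing the public randomness changes the problem.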

    Privacy-Preserving Decision Tree Training and Prediction against Malicious Server

    Privacy-preserving machine learning enables secure outsourcing of machine learning tasks to an untrusted service provider (server) while preserving the privacy of the user's data (client). Attaining good concrete efficiency for complicated machine learning tasks, such as training decision trees, is one of the challenges in this area. Prior works on privacy-preserving decision trees required the parties to have comparable computational resources, and instructed the client to perform computation proportional to the complexity of the entire task. In this work we present new protocols for privacy-preserving decision trees, for both training and prediction, achieving the following desirable properties: 1. Efficiency: the client's complexity is independent of the training-set size during training, and of the tree size during prediction. 2. Security: privacy holds against malicious servers. 3. Practical usability: high accuracy, fast prediction, and feasible training demonstrated on standard UCI datasets, encrypted with fully homomorphic encryption. To the best of our knowledge, our protocols are the first to offer all these properties simultaneously. The core of our work consists of two technical contributions. First, a new low-degree polynomial approximation for functions, leading to faster protocols for training and prediction on encrypted data. Second, a design of an easy-to-use mechanism for proving privacy against malicious adversaries that is suitable for a wide family of protocols, and in particular, our protocols; this mechanism could be of independent interest.

    CHIP and CRISP: Protecting All Parties Against Compromise through Identity-Binding PAKEs

    Recent advances in password-based key exchange (PAKE) protocols can offer stronger security guarantees for globally deployed security protocols. Notably, the OPAQUE protocol realizes saPAKE [Eurocrypt 2018], strengthening the protection offered by aPAKE to compromised servers: after compromising an saPAKE server, the adversary still has to perform a full brute-force search to recover any passwords or impersonate users. However, (s)aPAKEs do not protect client storage, and can only be applied in the so-called asymmetric setting, in which some parties, such as servers, do not communicate with each other. Nonetheless, passwords are also widely used in symmetric settings, where a group of parties share a password and can all communicate (e.g., Wi-Fi with client devices, routers, and mesh nodes; or industrial IoT scenarios). In these settings, the (s)aPAKE techniques cannot be applied, and the state-of-the-art still involves handling plaintext passwords. In this work, we propose the notions of (strong) identity-binding PAKEs that improve this situation in two dimensions: they protect all parties from compromise, and can also be applied in the symmetric setting. We propose stronger counterparts to state-of-the-art security notions from the asymmetric setting in the UC model, and construct protocols that provably realize them. Our constructions bind the local storage of all parties to abstract identities, building on ideas from identity-based key exchange, but without requiring a third party. Our first protocol, CHIP, generalizes the security of aPAKE protocols to all parties, forcing the adversary to perform a brute-force search to recover passwords or impersonate others. Our second protocol, CRISP, additionally renders any adversarial pre-computation useless, thereby offering saPAKE-like guarantees for all parties, instead of only the server. We evaluate prototype implementations of our protocols and show that even though they offer stronger security, their performance is in line with, or even better than, state-of-the-art protocols.

    The longitudinal structure of negative symptoms in treatment resistant schizophrenia

    Background and hypothesis: The negative symptoms of schizophrenia are strong prognostic factors but remain poorly understood and treated. Five negative symptom domains are frequently clustered into the motivation and pleasure (MAP) and emotional expression (EE) ‘dimensions’, but whether this structure remains stable and behaves as a single entity remains unclear. Study design: We examined a cohort of 153 patients taking clozapine for treatment-resistant schizophrenia in a regional mental health clinic. Patients were assessed longitudinally over a mean period of 45 months using validated scales for positive, negative and mood symptoms. Network analyses were performed to identify symptom ‘communities’ and their stability over time. The influence of common causes of secondary negative symptoms as well as centrality measures were also examined. Study results: Across patients at baseline, two distinct communities matching the clinical domains of MAP and EE were found. These communities remained highly stable and independent over time. The communities remained stable when considering psychosis, depression, and sedation severity, and these causes of secondary negative symptoms were clustered into the MAP community. Centrality measures also remained stable over time, with similar centrality measures across symptoms. Conclusions: Our results suggest that MAP and EE are independent dimensions that remain highly stable over time in chronic schizophrenia patients treated with clozapine. Common causes of secondary negative symptoms mapped onto the MAP dimension. Our results emphasise the need for clinical trials to address either MAP or EE, and that treating causes of secondary negative symptoms may improve MAP.